-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sql_connector): add support for sql connector #1543
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to 783aa43 in 2 minutes and 11 seconds
More details
- Looked at
357
lines of code in6
files - Skipped
0
files when reviewing. - Skipped posting
8
drafted comments based on config settings.
1. extensions/connectors/sql/pandasai_sql/__init__.py:19
- Draft comment:
Consider using a context manager (with
statement) to ensure that the database connection is properly closed after the query execution. This applies to all database connection functions in this file. - Reason this comment was not posted:
Comment was on unchanged code.
2. pandasai/__init__.py:43
- Draft comment:
Consider using a structured object likeSQLConnectionConfig
orSqliteConnectionConfig
for theconnector
parameter instead of a dictionary. This will provide better type safety and clarity. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The suggestion seems reasonable since it would provide better type safety and IDE support. However, the code already has validation for the connector type through SQL_SOURCE_TYPES check. The change would be a refactor rather than fixing a clear issue. The comment doesn't provide strong evidence that this change is necessary.
The suggestion could improve code quality, but it's more of an optional enhancement than a required fix. The current implementation with dict validation works fine.
While using structured objects would be nice, the current implementation is functional and validated. This suggestion falls into the category of optional refactoring rather than a necessary change.
Delete the comment as it suggests an optional enhancement rather than pointing out a clear issue that needs to be fixed.
3. pandasai/data_loader/loader.py:60
- Draft comment:
Ensureself.schema
is notNone
before accessingself.schema.source.type
to avoid potentialAttributeError
. - Reason this comment was not posted:
Comment did not seem useful.
4. extensions/connectors/sql/pandasai_sql/__init__.py:9
- Draft comment:
New functionality for SQL connectors is introduced, but there are no corresponding unit tests. Please add unit tests to ensure this new functionality is tested. - Reason this comment was not posted:
Comment was on unchanged code.
5. extensions/connectors/sql/pandasai_sql/__init__.py:42
- Draft comment:
The function nameload_from_cockroachdb
is inconsistent with the naming pattern of other functions likeload_from_mysql
andload_from_postgres
. Consider renaming it toload_from_cockroach
for consistency. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is actually suggesting to revert a deliberate change made in this PR. The new nameload_from_cockroachdb
is more precise and matches the official product name "CockroachDB". The old name was less accurate. The change appears intentional and makes the code better, not worse.
Maybe there's a good reason for shorter names in function signatures? Maybe there's a team naming convention I'm not aware of?
Even if there was a convention, using the full, official product name is clearer and more professional. The change makes the code better.
The comment should be deleted as it's suggesting to revert an intentional improvement to the function name.
6. pandasai/__init__.py:40
- Draft comment:
New functionality for SQL connectors is introduced, but there are no corresponding unit tests. Please add unit tests to ensure this new functionality is tested. - Reason this comment was not posted:
Marked as duplicate.
7. pandasai/data_loader/loader.py:29
- Draft comment:
New functionality for SQL connectors is introduced, but there are no corresponding unit tests. Please add unit tests to ensure this new functionality is tested. - Reason this comment was not posted:
Marked as duplicate.
8. pandasai/data_loader/semantic_layer_schema.py:22
- Draft comment:
New functionality for SQL connectors is introduced, but there are no corresponding unit tests. Please add unit tests to ensure this new functionality is tested. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_geewBNZVJ8tnNvhH
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## release/v3 #1543 +/- ##
==============================================
+ Coverage 79.09% 79.63% +0.53%
==============================================
Files 149 59 -90
Lines 6013 2013 -4000
==============================================
- Hits 4756 1603 -3153
+ Misses 1257 410 -847
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 148d585 in 1 minute and 37 seconds
More details
- Looked at
1582
lines of code in11
files - Skipped
0
files when reviewing. - Skipped posting
7
drafted comments based on config settings.
1. extensions/connectors/sql/pandasai_sql/__init__.py:38
- Draft comment:
Ensureconnection_info
is an instance ofSqliteConnectionConfig
for clarity when usingsqlite3.connect(connection_info.file_path)
. - Reason this comment was not posted:
Confidence changes required:50%
In theload_from_sqlite
function, the connection is established usingsqlite3.connect(connection_info.file_path)
. This is correct, but the comment should mention thatconnection_info
is an instance ofSqliteConnectionConfig
for clarity.
2. extensions/connectors/sql/pandasai_sql/__init__.py:42
- Draft comment:
Ensure the function nameload_from_cockroachdb
matches the loader function name in the__all__
list to avoid confusion. - Reason this comment was not posted:
Confidence changes required:50%
In theload_from_cockroachdb
function, the connection is established usingpsycopg2.connect
. The function name should match the loader function name in the__all__
list to avoid confusion.
3. pandasai/__init__.py:59
- Draft comment:
The error message should specifydf must be a PandasAI DataFrame
for consistency with the class name. - Reason this comment was not posted:
Confidence changes required:50%
In thecreate
function, the check fordf
being aPandaAI DataFrame
is correct, but the error message should specifyPandasAI DataFrame
for consistency with the class name.
4. pandasai/data_loader/semantic_layer_schema.py:101
- Draft comment:
Clarify the comment indicating that SQLite is a special case for local sources to avoid confusion. - Reason this comment was not posted:
Confidence changes required:50%
In theSemanticLayerSchema
class, thevalidate_type_and_fields
method has a comment indicating that SQLite is a special case for local sources. This should be clarified in the code to avoid confusion.
5. docs/v3/semantic-layer.mdx:138
- Draft comment:
The type for theconnector
field is incorrectly mentioned asDataFrame
. It should be a dictionary specifying connection details. Please update the documentation accordingly. - Reason this comment was not posted:
Marked as duplicate.
6. pandasai/__init__.py:59
- Draft comment:
raise ValueError("df must be a PandaAI DataFrame")
- Reason this comment was not posted:
Comment looked like it was already resolved.
7. pandasai/__init__.py:58
- Draft comment:
raise ValueError("df must be a PandaAI DataFrame")
- Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_QarLLfonNjse9mY2
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 6610d38 in 25 seconds
More details
- Looked at
56
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
6
drafted comments based on config settings.
1. pandasai/data_loader/loader.py:4
- Draft comment:
Unused import 'datetime' and 'timedelta' removed. Good cleanup. - Reason this comment was not posted:
Confidence changes required:0%
The import statement for 'datetime' and 'timedelta' is removed but not used in the file. This is a good cleanup.
2. pandasai/dataframe/base.py:242
- Draft comment:
'data_loader' parameter added to 'pull' method. Consistent with changes in 'virtual_dataframe.py'. - Reason this comment was not posted:
Confidence changes required:0%
The 'pull' method in 'base.py' and 'virtual_dataframe.py' has been updated to include 'data_loader' as a parameter. This change is consistent across both files and aligns with the intent of adding SQL connector support.
3. pandasai/dataframe/virtual_dataframe.py:39
- Draft comment:
'name' and 'description' are now popped from kwargs to prevent unintended passing to superclass. - Reason this comment was not posted:
Confidence changes required:0%
The 'name' and 'description' parameters are now being popped from kwargs in 'VirtualDataFrame'. This change ensures that these parameters are correctly handled and not passed to the superclass unintentionally.
4. pandasai/dataframe/virtual_dataframe.py:37
- Draft comment:
Add a period at the end of the error message for consistency.
raise VirtualizationError("Schema is required for virtualization!")
- Reason this comment was not posted:
Confidence changes required:10%
The error message in line 37 is missing a period at the end. This is a minor grammar issue but should be corrected for consistency.
5. pandasai/dataframe/base.py:239
- Draft comment:
The reinitialization of the DataFrame in thepull
method is repeated invirtual_dataframe.py
. Consider refactoring to avoid code duplication. - Reason this comment was not posted:
Confidence changes required:50%
The functionpull
inbase.py
andvirtual_dataframe.py
have similar code for reinitializing the DataFrame. This violates the DRY principle.
6. pandasai/dataframe/virtual_dataframe.py:48
- Draft comment:
The reinitialization of the DataFrame in thepull
method is repeated inbase.py
. Consider refactoring to avoid code duplication. - Reason this comment was not posted:
Confidence changes required:50%
The functionpull
inbase.py
andvirtual_dataframe.py
have similar code for reinitializing the DataFrame. This violates the DRY principle.
Workflow ID: wflow_IhG6yIOFkpnLao9d
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 52b4863 in 22 seconds
More details
- Looked at
19
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. docs/v3/semantic-layer.mdx:135
- Draft comment:
The change fromDataFrame
toDict
is inconsistent with the rest of the document, which still referencesDataFrame
. Ensure consistency across the documentation. - Reason this comment was not posted:
Comment did not seem useful.
2. docs/v3/semantic-layer.mdx:135
- Draft comment:
Ensure that the documentation is updated to reflect the change from using a DataFrame to a SQL connector source dict. This includes updating any examples, descriptions, and ensuring that tests are updated accordingly. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_89EW0oTbzuGlbl1X
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good to me 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 28e9510 in 27 seconds
More details
- Looked at
12
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. pandasai/__init__.py:112
- Draft comment:
The removal ofschema = df.schema
seems incorrect.schema
is used later and should be initialized withdf.schema
whendf
is notNone
. This could lead to aNameError
. - Reason this comment was not posted:
Comment was not on a valid diff hunk.
2. pandasai/__init__.py:112
- Draft comment:
The removal ofschema = df.schema
seems incorrect asschema
is used later in the code. Consider restoring this line. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_sXrC019EUxW7iOa0
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Important
Add support for SQL connectors in
pandasai
, enabling connections to MySQL, PostgreSQL, and SQLite, with updated documentation and tests.create()
function inpandasai/__init__.py
, allowing connection to MySQL, PostgreSQL, and SQLite.SQLConnectionConfig
andSqliteConnectionConfig
insemantic_layer_schema.py
for connection configurations.DatasetLoader
inloader.py
to handle SQL sources and execute queries.load_from_mysql
,load_from_postgres
,load_from_sqlite
, andload_from_cockroachdb
inpandasai_sql/__init__.py
.test_sql.py
,test_loader.py
, andtest_pandasai_init.py
to verify SQL connector functionality.create()
handles both DataFrame and connector inputs correctly.semantic-layer.mdx
to include examples and descriptions for using SQL connectors.This description was created by
for 28e9510. It will automatically update as commits are pushed.